CMS software deployment on OSG
A set of software deployment tools has been developed for the installation,
verification, and removal of a CMS software release. The tools, which mainly target
deployment on the OSG, provide instant release deployment, corrective resubmission
of the initial installation job, and an independent web-based deployment portal with a
Grid Security Infrastructure login mechanism. We have performed over 500 installations
and found the tools reliable and adaptable enough to cope with problems arising from
changes in the Grid computing environment and in the software releases. We present
the design of the tools, statistics gathered during their operation, and our experience
with CMS software deployment on the OSG Grid computing environment.
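The tools themselves are not reproduced here; as a rough illustration of the "corrective resubmission" idea, the sketch below retries a failed installation job a bounded number of times. The `cms-install` command, its options, and the retry limit are hypothetical placeholders, not the actual tool interface.

```python
"""Minimal sketch of corrective resubmission: submit an installation job,
check the result, and resubmit on failure. All names are illustrative."""
import subprocess

MAX_ATTEMPTS = 3  # illustrative retry limit


def install_release(site, release):
    """Submit one installation job for `release` at `site`; return True on success."""
    # Hypothetical installer command; the real tools wrap Grid job submission.
    result = subprocess.run(["cms-install", "--site", site, "--release", release])
    return result.returncode == 0


def deploy_with_resubmission(site, release):
    """Retry the installation a few times before giving up."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if install_release(site, release):
            return True
        print(f"attempt {attempt} failed at {site}, resubmitting")
    return False
```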
Designing Computing System Architecture and Models for the HL-LHC era
This paper describes a programme to study the computing model in CMS after the next
long shutdown, near the end of the decade.
Comment: Submitted to proceedings of the 21st International Conference on Computing
in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan
Characterizing network paths in and out of the clouds
Commercial Cloud computing is becoming mainstream, with funding agencies moving
beyond prototyping and starting to fund production campaigns, too. An important
aspect of any scientific computing production campaign is data movement, both
incoming and outgoing. And while the performance and cost of VMs are relatively
well understood, network performance and cost are not. This paper provides a
characterization of networking in various regions of Amazon Web Services,
Microsoft Azure and Google Cloud Platform, both between Cloud resources and major
data transfer nodes (DTNs) in the Pacific Research Platform, including OSG data
federation caches in the network backbone, and inside the clouds themselves. The
paper contains a qualitative analysis of the results as well as latency and
throughput measurements. It also includes an analysis of the costs involved with
Cloud-based networking.
Comment: 7 pages, 1 figure, 5 tables, to be published in CHEP19 proceedings
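The measurement scripts are not part of the abstract; the following is a minimal sketch of the kind of latency and throughput probe such a characterization relies on, assuming `ping` and `iperf3` are available and an iperf3 server is already running on each target. The endpoint hostnames are placeholders, not the actual cloud VMs or PRP DTNs used in the paper.

```python
"""Minimal sketch of a latency/throughput probe between network endpoints."""
import json
import re
import subprocess

# Hypothetical endpoints: a cloud VM and a PRP data transfer node (DTN).
ENDPOINTS = ["vm.example-cloud-region.net", "dtn.example-prp-site.edu"]


def measure_latency(host, count=10):
    """Return the average round-trip time in ms, using the system ping utility."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True, check=True).stdout
    # Summary line looks like: rtt min/avg/max/mdev = 0.1/0.2/0.3/0.0 ms
    match = re.search(r"= [\d.]+/([\d.]+)/", out)
    return float(match.group(1)) if match else None


def measure_throughput(host, seconds=10):
    """Return achieved throughput in Gbps from an iperf3 server running on `host`."""
    out = subprocess.run(["iperf3", "-c", host, "-t", str(seconds), "-J"],
                         capture_output=True, text=True, check=True).stdout
    report = json.loads(out)
    return report["end"]["sum_received"]["bits_per_second"] / 1e9


if __name__ == "__main__":
    for host in ENDPOINTS:
        print(f"{host}: {measure_latency(host)} ms, {measure_throughput(host):.2f} Gbps")
```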
Running a Pre-Exascale, Geographically Distributed, Multi-Cloud Scientific Simulation
As we approach the Exascale era, it is important to verify that the existing
frameworks and tools will still work at that scale. Moreover, public Cloud computing
has been emerging as a viable solution for both prototyping and urgent computing.
Using the elasticity of the Cloud, we have thus put in place a pre-exascale HTCondor
setup for running a scientific simulation in the Cloud, the chosen application being
IceCube's photon propagation simulation. That is, this was not purely a demonstration
run; it was also used to produce valuable and much-needed scientific results for the
IceCube collaboration. In order to reach the desired scale, we aggregated GPU
resources across 8 GPU models from many geographic regions across Amazon Web
Services, Microsoft Azure, and the Google Cloud Platform. Using this setup, we
reached a peak of over 51k GPUs, corresponding to almost 380 single-precision
PFLOPS (PFLOP32s), for a total integrated compute of about 100k GPU hours. In this
paper we provide a description of the setup, the problems that were discovered and
overcome, and a short description of the actual science output of the exercise.
Comment: 18 pages, 5 figures, 4 tables, to be published in Proceedings of ISC
High Performance 202
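As a rough sanity check of the headline numbers (over 51k GPUs, almost 380 PFLOP32s), the sketch below sums approximate vendor peak single-precision figures over an illustrative GPU mix. The per-model counts are assumptions chosen only to total 51k GPUs; they are not the run's actual composition, and the result is just in the same ballpark as the reported figure.

```python
"""Back-of-the-envelope aggregation of peak fp32 compute over a GPU mix."""

# Approximate vendor peak single-precision TFLOPS per GPU model (datasheet values).
PEAK_TFLOP32 = {
    "V100": 14.0, "P100": 9.3, "P40": 11.8, "P4": 5.5,
    "T4": 8.1, "M60": 4.8, "K80": 4.4, "K520": 2.4,
}

# Hypothetical mix summing to 51k GPUs (assumption for illustration only).
mix = {"V100": 6000, "P100": 5000, "P40": 2000, "P4": 4000,
       "T4": 16000, "M60": 5000, "K80": 11000, "K520": 2000}

total_gpus = sum(mix.values())
total_pflop32 = sum(count * PEAK_TFLOP32[model] for model, count in mix.items()) / 1000.0
print(f"{total_gpus} GPUs -> ~{total_pflop32:.0f} PFLOP32s peak (illustrative mix)")
```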
The Scalable Systems Laboratory: a Platform for Software Innovation for HEP
The Scalable Systems Laboratory (SSL), part of the IRIS-HEP Software
Institute, provides Institute participants and HEP software developers
generally with a means to transition their R&D from conceptual toys to testbeds
to production-scale prototypes. The SSL enables tooling, infrastructure, and
services supporting the innovation of novel analysis and data architectures,
development of software elements and tool-chains, reproducible functional and
scalability testing of service components, and foundational systems R&D for
accelerated services developed by the Institute. The SSL is constructed with a
core team having expertise in scale testing and deployment of services across a
wide range of cyberinfrastructure. The core team embeds and partners with other
areas in the Institute, and with LHC and other HEP development and operations
teams as appropriate, to define investigations and required service deployment
patterns. We describe the approach and experiences with early application
deployments, including analysis platforms and intelligent data delivery
systems.
Moving the California distributed CMS xcache from bare metal into containers using Kubernetes
The University of California system has excellent networking between all of its
campuses as well as to a number of other universities in California, including
Caltech, most of them connected at 100 Gbps. UCSD and Caltech have thus joined their
disk systems into a single logical xcache system, with worker nodes from both sites
accessing data from disks at either site. This setup has been in place for a couple
of years now and has proven to work very well. Coherently managing nodes at multiple
physical locations has, however, not been trivial, and we have been looking for ways
to improve operations. With the Pacific Research Platform (PRP) now providing a
Kubernetes resource pool spanning resources in the Science DMZs of all the UC
campuses, we have recently migrated the xcache services from bare-metal hosts into
containers. This paper presents our experience in both migrating to and operating in
the new environment.
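The paper's actual manifests are not included in the abstract; as a minimal sketch, the snippet below declares an xcache service as a Kubernetes Deployment through the official Python client. The namespace, container image, resource requests, and port (xrootd's default 1094) are illustrative assumptions, not the PRP configuration.

```python
"""Minimal sketch of declaring an xcache service as a Kubernetes Deployment."""
from kubernetes import client, config


def make_xcache_deployment(name="xcache", image="opensciencegrid/xcache:stable"):
    """Build a single-replica Deployment object for an xcache container."""
    container = client.V1Container(
        name=name,
        image=image,  # assumed image name, for illustration only
        ports=[client.V1ContainerPort(container_port=1094)],  # default xrootd port
        resources=client.V1ResourceRequirements(
            requests={"cpu": "4", "memory": "16Gi"},  # illustrative sizing
        ),
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": name}),
        spec=client.V1PodSpec(containers=[container]),
    )
    spec = client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": name}),
        template=template,
    )
    return client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name=name),
        spec=spec,
    )


if __name__ == "__main__":
    config.load_kube_config()  # use local kubeconfig credentials
    apps = client.AppsV1Api()
    apps.create_namespaced_deployment(namespace="osg-xcache", body=make_xcache_deployment())
```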
Parallelized and Vectorized Tracking Using Kalman Filters with CMS Detector Geometry and Events
The High-Luminosity Large Hadron Collider at CERN will be characterized by
greater pileup of events and higher occupancy, making the track reconstruction
even more computationally demanding. Existing algorithms at the LHC are based
on Kalman filter techniques with proven excellent physics performance under a
variety of conditions. Starting in 2014, we have been developing
Kalman-filter-based methods for track finding and fitting adapted for many-core
SIMD processors that are becoming dominant in high-performance systems.
This paper summarizes the latest extensions to our software that allow it to
run on the realistic CMS-2017 tracker geometry using CMSSW-generated events,
including pileup. The reconstructed tracks can be validated against either the
CMSSW simulation that generated the hits, or the CMSSW reconstruction of the
tracks. In general, the code's computational performance has continued to
improve while the above capabilities were being added. We demonstrate that the
present Kalman filter implementation is able to reconstruct events with
comparable physics performance to CMSSW, while providing generally better
computational performance. Further plans for advancing the software are
discussed.
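The vectorized implementation itself is not shown in the abstract; for orientation, the following is a generic, minimal NumPy sketch of the single Kalman filter predict-and-update step that such track fitting repeats layer by layer. It illustrates the underlying technique only, not the paper's SIMD-optimized code.

```python
"""Generic single Kalman filter predict/update step, as used in track fitting."""
import numpy as np


def kalman_step(x, P, F, Q, H, R, z):
    """One predict + update step.
    x: state estimate, P: state covariance, F: propagation (transport) matrix,
    Q: process noise, H: measurement projection, R: measurement noise, z: measured hit."""
    # Predict: propagate the state and its covariance to the next detector layer.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: combine the prediction with the measured hit.
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```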
Hadoop distributed file system for the Grid
Data distribution, storage and access are essential to CPU-intensive and data-intensive high-performance Grid computing. A newly emerged file system, the Hadoop Distributed File System (HDFS), is deployed and tested within the Open Science Grid (OSG) middleware stack. Efforts have been taken to integrate HDFS with other Grid tools to build a complete service framework for the Storage Element (SE). Scalability tests show that sustained high inter-DataNode data transfer rates can be achieved while the cluster is fully loaded with data-processing jobs. WAN transfer into HDFS, supported by BeStMan and tuned GridFTP servers, demonstrates the scalability and robustness of the system. The Hadoop client can be deployed on interactive machines to support remote data access. The ability to automatically replicate precious data is especially important for computing sites, as demonstrated at the Large Hadron Collider (LHC) computing centers. The simplicity of operating an HDFS-based SE significantly reduces the cost of ownership of petabyte-scale data storage compared to alternative solutions.
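The abstract mentions remote client access and automatic replication; the sketch below shows what that looks like from an interactive machine using the standard `hdfs dfs` command-line interface. The paths and replication factor are illustrative, not the site's actual configuration.

```python
"""Minimal sketch of remote HDFS access and replication control from an interactive node."""
import subprocess


def hdfs(*args):
    """Run an `hdfs dfs` subcommand and return its stdout."""
    result = subprocess.run(["hdfs", "dfs", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Copy a local file into the SE namespace (hypothetical path).
hdfs("-put", "events.root", "/store/user/example/events.root")

# Raise the replication factor for a precious dataset and wait for completion.
hdfs("-setrep", "-w", "3", "/store/user/example/events.root")

# List the directory to confirm the file is present.
print(hdfs("-ls", "/store/user/example"))
```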
HEP Community White Paper on Software trigger and event reconstruction
Realizing the physics programs of the planned and upgraded high-energy
physics (HEP) experiments over the next 10 years will require the HEP community
to address a number of challenges in the area of software and computing. For
this reason, the HEP software community has engaged in a planning process over
the past two years, with the objective of identifying and prioritizing the
research and development required to enable the next generation of HEP
detectors to fulfill their full physics potential. The aim is to produce a
Community White Paper which will describe the community strategy and a roadmap
for software and computing research and development in HEP for the 2020s. The
topics of event reconstruction and software triggers were considered by a joint
working group and are summarized together in this document.
Comment: Editors Vladimir Vava Gligorov and David Lange